T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
In spoken language understanding (SLU), a natural solution is to concatenate
pre-trained speech models (e.g. HuBERT) and pre-trained language models (PLMs,
e.g. T5). Most previous works use pre-trained language models with
subword-based tokenization. However, the granularity of input units affects
the alignment between speech model outputs and language model inputs, and PLMs
with character-based tokenization remain underexplored. In this work, we
conduct extensive studies on how PLMs with different tokenization strategies
affect spoken language understanding tasks, including spoken question
answering (SQA) and speech translation (ST). We further extend the idea to
create T5lephone (pronounced as telephone), a variant of T5 that is pretrained
on phonemicized text. We initialize T5lephone with existing PLMs to pretrain
it using relatively lightweight computational resources. We reach
state-of-the-art performance on NMSQA, and the T5lephone model exceeds T5 with
other types of units on end-to-end SQA and ST
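As a toy illustration of what "phonemicized text" means as a pretraining input, the sketch below maps words to phoneme sequences with a stand-in lexicon. The lexicon, the ARPAbet-style symbols, and the `phonemicize` helper are illustrative assumptions, not the paper's actual grapheme-to-phoneme pipeline.

```python
# Minimal sketch of phonemicizing text, assuming a toy ARPAbet-style
# lexicon; a real system would use a full G2P model or dictionary.
TOY_LEXICON = {
    "speech": ["S", "P", "IY", "CH"],
    "and":    ["AE", "N", "D"],
    "text":   ["T", "EH", "K", "S", "T"],
}

def phonemicize(sentence):
    """Map each known word to its phoneme sequence; unknown words
    become a single <unk> placeholder."""
    phonemes = []
    for word in sentence.lower().split():
        phonemes.extend(TOY_LEXICON.get(word, ["<unk>"]))
    return phonemes

print(phonemicize("speech and text"))
# ['S', 'P', 'IY', 'CH', 'AE', 'N', 'D', 'T', 'EH', 'K', 'S', 'T']
```

A phoneme-level T5 would then see such sequences, rather than subwords or characters, as its pretraining stream.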
Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning
In this work, we propose a method to create domain-sensitive speech
recognition models that utilize textual domain information by conditioning
generation on a given text prompt. This is accomplished by fine-tuning a
pre-trained, end-to-end model (Whisper) to learn from demonstrations with
prompt examples. We show that this ability generalizes to different domains
and even varied prompt contexts, with our model achieving a Word Error Rate
(WER) reduction of up to 33% on unseen datasets from various domains, such as
medical conversations, air traffic control communication, and financial
meetings. Considering the limited availability of audio-transcript pair data,
we further extend our method to text-only fine-tuning to achieve domain
sensitivity as well as domain adaptation. We demonstrate that our text-only
fine-tuned model can also attend to various prompt contexts, reaching a
maximum WER reduction of 29% on the medical conversation dataset.
Comment: F-T Liao and Y-C Chan contributed equally
Hemispheric dispersion of radioactive plume laced with fission nuclides from the Fukushima nuclear event
Radioactivities of particulate 131I and 137Cs released from the Fukushima nuclear accident were monitored in a regional aerosol network including two high mountain sites (central Taiwan and the Tibetan Plateau). The results were integrated with data measured elsewhere around the world, with special focus on the mid-latitudes. The Fukushima radiation clouds (FRCs) were transported hemispherically by the westerlies in two layers, the first at 3–4 km and the second up to 5 km or more. 131I and 137Cs were fractionated during transport: 137Cs was concentrated in the shallower layer and susceptible to depositional removal, while 131I moved faster and higher. This accident may serve as an example for identifying some atmospheric processes on the hemispheric scale
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR
Spoken language understanding (SLU) is a task aiming to extract high-level
semantics from spoken utterances. Previous works have investigated the use of
speech self-supervised models and textual pre-trained models, which have shown
reasonable improvements on various SLU tasks. However, because of the
mismatched modalities between speech signals and text tokens, previous methods
usually require complex framework designs. This work proposes a simple yet
efficient unsupervised paradigm that connects speech and textual pre-trained
models, resulting in an unsupervised speech-to-semantic pre-trained model for
various SLU tasks. Specifically, we propose to use unsupervised automatic
speech recognition (ASR) as a connector that bridges the different modalities
used in speech and textual pre-trained models. Our experiments show that
unsupervised ASR itself can improve the representations from speech
self-supervised models. More importantly, it serves as an efficient connector
between speech and textual pre-trained models, improving the performance on
five different SLU tasks. Notably, on spoken question answering, we reach the
state-of-the-art result on the challenging NMSQA benchmark.
Comment: ICASSP 2023 submission
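The connector idea can be caricatured as a three-stage pipeline in which an ASR stage turns speech representations into text that a textual model then consumes. Every component below is a stub invented for illustration; none of it reproduces HuBERT, the unsupervised ASR, or the textual PLM used in the paper.

```python
# Toy pipeline: speech encoder -> ASR connector -> textual model.
def speech_encoder(waveform):
    # stand-in for a self-supervised speech model: one discrete
    # "frame" per input sample
    return [int(abs(x) * 100) % 26 for x in waveform]

def unsupervised_asr(frames):
    # stand-in for the ASR connector: map each frame to a character,
    # converting the speech modality into the text modality
    return "".join(chr(ord("a") + f) for f in frames)

def textual_plm(text):
    # stand-in for the downstream textual pre-trained model
    return {"input": text, "n_tokens": len(text)}

out = textual_plm(unsupervised_asr(speech_encoder([0.01, 0.02])))
print(out)  # {'input': 'bc', 'n_tokens': 2}
```

The point of the design is that the two pre-trained models never need a learned cross-modal alignment module: the ASR stage emits tokens already in the text model's input space.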
Extending the Pre-Training of BLOOM for Improved Support of Traditional Chinese: Models, Methods and Results
In this paper, we present the multilingual language model BLOOM-zh, which
features enhanced support for Traditional Chinese. BLOOM-zh has its origins in
the open-source BLOOM models presented by BigScience in 2022. Starting from
the released models, we extended the pre-training of BLOOM by an additional
7.4 billion tokens in Traditional Chinese and English, covering a variety of
domains such as news articles, books, encyclopedias, and educational
materials, as well as spoken language. To show the properties of BLOOM-zh, we
evaluate its performance on both existing and newly created benchmark
scenarios. BLOOM-zh outperforms its predecessor on most Traditional Chinese
benchmarks while maintaining its English capability. We release all our models
to the research community
A research roadmap for quantifying non-state and subnational climate mitigation action
Non-state and subnational climate actors have become central to global climate change governance. Quantitatively assessing the climate mitigation undertaken by these entities is critical to understanding the credibility of this trend. In this Perspective, we make recommendations regarding five main areas of research and methodological development related to evaluating non-state and subnational climate actions: defining clear boundaries and terminology; use of common methodologies to aggregate and assess non-state and subnational contributions; systematically dealing with issues of overlap; estimating the likelihood of implementation; and addressing data gaps
Initial Visible and Mid-IR Characterization of P/2019 LD₂ (ATLAS), an Active Transitioning Centaur Among the Trojans, with Hubble, Spitzer, ZTF, Keck, APO and GROWTH Imaging and Spectroscopy
We present visible and mid-infrared imagery and photometry of Jovian co-orbital comet P/2019 LD₂ (ATLAS) taken with Hubble Space Telescope/WFC3 on 2020 April 1, Spitzer Space Telescope/IRAC on 2020 January 25, the Zwicky Transient Facility between 2019 April 9 and 2019 Nov 8, and the GROWTH telescope network from 2020 May to July, as well as visible spectroscopy from Keck/LRIS on 2020 August 19. Our observations indicate that LD₂ has a nucleus with radius 0.2-1.8 km assuming a 0.08 albedo and that the coma is dominated by ∼100 μm-scale dust ejected at ∼1 m/s speeds with a ∼1" jet pointing in the SW direction. LD₂ experienced a total dust mass loss of ∼10⁸ kg and a dust mass loss rate of ∼6 kg/s, with Afρ/cross-section varying between ∼85 cm/125 km² and ∼200 cm/310 km² between 2019 April 9 and 2019 Nov 8. If the Afρ/cross-section increase remained constant, it implies that LD₂ has remained active since ∼2018 November, when it came within 4.8 au of the Sun, a typical distance for comets to begin sublimation of H₂O. From our 4.5 μm Spitzer observations, we set a limit on CO/CO₂ gas production of ∼10²⁷/∼10²⁶ mol/s. Multiple-bandpass photometry of LD₂ taken by the GROWTH network, measured in a 10,000 km aperture, provides color measurements of g-r = 0.59±0.03, r-i = 0.18±0.05, and i-z = 0.01±0.07, colors typical of comets. We set a spectroscopic upper limit on the production of H₂O gas of ∼80 kg/s. Improving the orbital solution for LD₂ with our observations, we determine that the long-term orbit of LD₂ is that of a typical Jupiter Family Comet, having close encounters with Jupiter coming within ∼0.5 Hill radius in the last ∼3 y and to within 0.8 Hill radius in ∼9 y, and has a 95% chance of being ejected from the Solar System in < 10 Myr
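Taking the abstract's rounded figures at face value, a quick back-of-the-envelope check: a total dust loss of ∼10⁸ kg at ∼6 kg/s implies an activity span of a few hundred days, broadly compatible with an onset near 2018 November. The rate surely varied over time, so this is an order-of-magnitude sanity check only, using the abstract's own numbers.

```python
# Order-of-magnitude check of the quoted dust figures.
total_mass_kg = 1e8   # total dust mass loss (~1e8 kg, from the abstract)
rate_kg_per_s = 6.0   # mean dust mass-loss rate (~6 kg/s)

seconds = total_mass_kg / rate_kg_per_s
days = seconds / 86400  # 86400 s per day
print(f"implied activity span: {days:.0f} days")  # a few hundred days
```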
Pathogenic LRRK2 Mutations Do Not Alter Gene Expression in Cell Model Systems or Human Brain Tissue
Point mutations in LRRK2 cause autosomal dominant Parkinson's disease. Despite extensive efforts to determine the mechanism of cell death in patients with LRRK2 mutations, the aetiology of LRRK2 PD is not well understood. To examine possible alterations in gene expression linked to the presence of LRRK2 mutations, we carried out a case versus control analysis of global gene expression in three systems: fibroblasts isolated from LRRK2 mutation carriers and healthy, non-mutation-carrying controls; brain tissue from G2019S mutation carriers and controls; and HEK293 inducible LRRK2 wild-type and mutant cell lines. No significant alteration in gene expression was found in these systems following correction for multiple testing. These data suggest that any alterations in basal gene expression in fibroblasts or cell lines containing mutations in LRRK2 are likely to be quantitatively small. This work suggests that LRRK2 is unlikely to play a direct role in modulation of gene expression, although it remains possible that this protein can influence mRNA expression under pathogenic circumstances